IBM HR Analytics Employee Attrition and Performance Dataset

In this study, we analyze the IBM HR Analytics dataset available from kaggle.com.

The data is fictional and was created by IBM data scientists.

Categorical Parameters:

Parameter                  1              2        3          4            5
Education                  Below College  College  Bachelor   Master       Doctor
Environment Satisfaction   Low            Medium   High       Very High    -
Job Involvement            Low            Medium   High       Very High    -
Job Satisfaction           Low            Medium   High       Very High    -
Performance Rating         Low            Good     Excellent  Outstanding  -
Relationship Satisfaction  Low            Medium   High       Very High    -
WorkLife Balance           Bad            Good     Better     Best         -

These parameters are stored as integer codes, and each code maps to the label listed above.
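The mapping from codes to labels can be sketched as a set of plain dictionaries. This is a minimal sketch, not the original notebook's code: the CamelCase column names (such as `EnvironmentSatisfaction`) are assumed to match the CSV headers, and `decode` is a hypothetical helper introduced here for illustration.

```python
# Integer code -> label, transcribed from the table of categorical parameters.
# Column names are assumed to match the CSV headers.
FOUR_LEVEL = {1: "Low", 2: "Medium", 3: "High", 4: "Very High"}

CATEGORY_LABELS = {
    "Education": {
        1: "Below College",
        2: "College",
        3: "Bachelor",
        4: "Master",
        5: "Doctor",
    },
    "EnvironmentSatisfaction": FOUR_LEVEL,
    "JobInvolvement": FOUR_LEVEL,
    "JobSatisfaction": FOUR_LEVEL,
    "PerformanceRating": {1: "Low", 2: "Good", 3: "Excellent", 4: "Outstanding"},
    "RelationshipSatisfaction": FOUR_LEVEL,
    "WorkLifeBalance": {1: "Bad", 2: "Good", 3: "Better", 4: "Best"},
}


def decode(column, code):
    """Return the label for a code; values of unmapped columns pass through."""
    labels = CATEGORY_LABELS.get(column)
    if labels is None:
        return code
    return labels.get(int(code), code)


print(decode("Education", 3))
print(decode("WorkLifeBalance", "4"))
```

Keeping the raw integers in the data and decoding only for display preserves the natural ordering of the levels, which is convenient when sorting or plotting.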

Loading the Dataset

First, let's take a look at the dataset.
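A minimal sketch of the loading step follows. It uses only the standard library so that it runs without extra dependencies; in a notebook, `pandas.read_csv` would be the more usual choice. The file name in `CSV_PATH` is an assumption about the Kaggle download, `load_rows` is a hypothetical helper, and the in-memory sample is made up so the sketch runs without the real file.

```python
import csv
import io

# Assumed file name of the Kaggle download; adjust to wherever the CSV lives.
CSV_PATH = "WA_Fn-UseC_-HR-Employee-Attrition.csv"


def load_rows(source):
    """Read the dataset into a list of dicts, one per employee.

    `source` is either a file path or an already opened text stream.
    """
    if isinstance(source, str):
        # utf-8-sig drops a byte-order mark if the file starts with one.
        with open(source, newline="", encoding="utf-8-sig") as handle:
            return list(csv.DictReader(handle))
    return list(csv.DictReader(source))


# Made-up stand-in rows, not taken from the real dataset.
sample = io.StringIO(
    "Age,Attrition,Department,OverTime\n"
    "30,Yes,Sales,Yes\n"
    "45,No,Human Resources,No\n"
)
rows = load_rows(sample)
print(len(rows), "rows; columns:", list(rows[0].keys()))
```

With the real file in place, `load_rows(CSV_PATH)` would return one dictionary per employee. Every value arrives as a string, so numeric columns need converting before analysis.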

A quick overview of the data distribution:
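One way to get such an overview is to summarize each column in turn: numeric columns by count, minimum, median, mean and maximum, and all other columns by value counts. The sketch below assumes rows shaped as dictionaries of strings, as a CSV reader would produce them; `summarize` is a hypothetical helper and the sample rows are made up.

```python
import statistics
from collections import Counter


def summarize(rows, column):
    """Summarize one column.

    Returns basic statistics if every value parses as a number,
    otherwise a dictionary of value counts.
    """
    values = [row[column] for row in rows]
    try:
        numbers = [float(value) for value in values]
    except ValueError:
        return dict(Counter(values))
    return {
        "count": len(numbers),
        "min": min(numbers),
        "median": statistics.median(numbers),
        "mean": statistics.fmean(numbers),
        "max": max(numbers),
    }


# Made-up rows for illustration; not taken from the real dataset.
rows = [
    {"Age": "30", "Department": "Sales"},
    {"Age": "40", "Department": "Sales"},
    {"Age": "50", "Department": "Human Resources"},
]
print(summarize(rows, "Age"))
print(summarize(rows, "Department"))
```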

Exploratory Data Analysis
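Since the dataset is about attrition, a natural way to explore each of the features below is to compare the attrition rate across its values. The following is a sketch under assumptions rather than the original analysis: it presumes an `Attrition` column holding "Yes" or "No", `attrition_rate_by` is a hypothetical helper, and the sample rows are made up.

```python
from collections import defaultdict


def attrition_rate_by(rows, column, target="Attrition", positive="Yes"):
    """Return {category: share of rows in that category where target == positive}."""
    totals = defaultdict(int)
    leavers = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        if row[target] == positive:
            leavers[key] += 1
    return {key: leavers[key] / totals[key] for key in totals}


# Made-up rows for illustration; not taken from the real dataset.
rows = [
    {"OverTime": "Yes", "Attrition": "Yes"},
    {"OverTime": "Yes", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
    {"OverTime": "No", "Attrition": "No"},
]
rates = attrition_rate_by(rows, "OverTime")
print(rates)
```

The same call works for any categorical column, for example `attrition_rate_by(rows, "Department")`. Continuous features such as age would first need to be grouped into bins.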

Age

Business Travel

Department

Distance from Home

Education

Education Field

Environment Satisfaction

Hourly Rate

Job Involvement

Job Level

Job Roles

Job Satisfaction

Marital Status

Number of Companies Worked

Over Time

Percent Salary Hike

Performance Rating

Relationship Satisfaction

Stock Option Level

Total Working Years

Training Times Last Year

Work-Life Balance Score

Years at the Company

Years In Current Role

Years Since Last Promotion

Years With Current Manager


References

  1. Kaggle Dataset: IBM HR Analytics Employee Attrition & Performance
  2. Getting Started with Plotly in Python